Real-time financial surveillance via quickest
change-point detection methods
We consider the problem of efficient financial surveillance
aimed at “on-the-go” detection of structural breaks
(anomalies) in “live”-monitored financial time series. With
the problem approached statistically, viz. as that of multi-cyclic
sequential (quickest) change-point detection, we propose
a semi-parametric multi-cyclic change-point detection
procedure to promptly spot anomalies as they occur in the
time series under surveillance. The proposed procedure is
a derivative of the likelihood ratio-based Shiryaev-Roberts
(SR) procedure; the latter is a quasi-Bayesian surveillance
method known to deliver the fastest (in the multi-cyclic
sense) speed of detection, whatever the false alarm frequency.
We offer a case study where we first carry out, step
by step, a preliminary statistical analysis of a set of real-world
financial data, and then set up and apply (a) the proposed
SR-based anomaly-detection procedure and (b) the
celebrated Cumulative Sum (CUSUM) chart to detect structural
breaks in the data. While both procedures performed
well, the proposed SR-derivative, in line with intuition,
seemed slightly better.

AMS 2000 SUBJECT CLASSIFICATIONS: Primary 62L10, 
62L15; secondary 62P05. 
1. INTRODUCTION

The world’s history of economic crises, including the latest
and still-ongoing global financial meltdown and recession
that started in 2008-2009, provides graphic evidence
of the importance of efficient methods for continuous financial
surveillance [7, 8]. By enabling early and reliable detection
of anomalous patterns, such methods form a foundation
for active risk management [20]. This paper examines the
possibility of approaching the problem of financial monitoring
statistically. Specifically, the principal idea is to exploit
the machinery of sequential (quickest) change-point
detection. The subject is concerned with the development
and evaluation of “watch dog”-type procedures for early
yet reliable detection of unanticipated changes (structural
breaks) that may occur in the statistical profile of a “live”-monitored
time series. For an introduction to the subject,
see, e.g., [44, 59, 1, 36, 55], or [47, Part II], and the references
therein.

One of the first comprehensive expositions of nonparametric
change-point detection methods for online financial
surveillance was offered by Brodsky and Darkhovsky [2, 3].
More recently, the machinery of Singular Spectrum Analysis
(SSA) has also been utilized in [10, 23, 58, 11]. In particular,
it was demonstrated via numerous case studies involving
intricate real-world data that the SSA-based version of
Page’s [25] celebrated Cumulative Sum (CUSUM) “inspection
scheme” is able to efficiently detect changes of rather
complicated structure (e.g., in the frequency of a periodic
component of the time series of interest).

However, nearly all of the research on the subject done
to date revolves around only three change-point detection
methods: the Shewhart $\bar{X}$-chart [40, 41], the CUSUM
“inspection scheme” [25], and the Exponentially Weighted
Moving Average (EWMA) chart [38]. Over the years, the
three have de facto become the detection tools in applied
sequential analysis, especially in quality control. Part of
the reason is the methods’ simplicity, and another part
is their theoretically established strong optimality properties
[24, 37, 29]. By contrast, the focus of this paper is on
the Shiryaev-Roberts (SR) procedure [42, 43, 39, 44]. Although
the SR procedure is only slightly “younger” than the
CUSUM and EWMA charts, it has heretofore been largely
neglected by practitioners as well as by statisticians. Consequently,
examples of applications of the SR procedure to
real-world data are extremely rare. However, the SR procedure
has been recently discovered [30, 31, 45] to possess
strong optimality properties in Shiryaev’s [42, 43, 44] multi-cyclic
setting, which is a setting adequate in many real-world
applications. Motivated by this, the authors of [35, 52] have
successfully applied the SR procedure in the area of cybersecurity,
namely for online detection of anomalies (caused,
e.g., by intrusions) in computer networks. The present paper
is intended to provide yet another example of an SR-type
anomaly-detection algorithm capable of operating on real-world
financial data. Due to the exact multi-cyclic optimality
of the SR procedure, the proposed algorithm is expected
to compare favorably to other detection schemes, in particular
the multi-cyclic CUSUM procedure.

We would like to remark that, to the best of our knowledge,
the only other attempt to apply the SR procedure to
real-world financial data is that made previously by
Ergashev [6]. Specifically, Ergashev [6] was concerned with
the problem of early detection of the “turning points” in
the US business cycles. These cycles, also known as the US
economic cycles, are alternating periods of recession and recovery,
manifested in fluctuations of the US economic activity
around its long-term potential level. Hence, the “turning
points” effectively signify the onset of either recession
(contraction) or recovery (expansion) of the US economy.
To detect these “turning points”, Ergashev [6] applied the
SR procedure and the CUSUM and EWMA charts to the
series of Composite Leading Indicators (CLIs); the CLIs
are updated monthly by the Organisation for Economic
Co-operation and Development (OECD; see on the Web
at http://www.oecd.org) to provide early signals of “turning
points” in the US business cycles. Through experiments
involving the actual CLIs series, Ergashev [6] demonstrated
the SR procedure to be better (i.e., quicker) at detecting
the US business cycles’ “turning points” than the CUSUM
and EWMA charts with the same level of the “false positive”
risk. In this work we too provide experimental evidence that
the SR procedure might be superior to the CUSUM chart
when it comes to detecting structural breaks in time series
of real-world stock prices.

The rest of the paper is organized as follows. We start
in Section 2 with a brief introduction to the area of quickest
change-point detection and provide a short overview of
the state-of-the-art in the field. Next, in Section 3 we offer
an SR-based anomaly-detection algorithm suitable to
operate on real-world data. Section 4 is devoted to a case
study where we apply the proposed algorithm to perform
anomaly detection in a real-world financial time series. The
conclusions follow in Section 5, which sums up the entire
paper.

2. PRELIMINARY BACKGROUND ON 
QUICKEST CHANGE-POINT 
DETECTION 

The aim of this section is two-fold: (a) to provide a short
but formal introduction to the problem of quickest change-point
detection and (b) to give a brief account of the state-of-the-art
in the field. This is necessary as background for
the later sections. For lack of space, we shall only consider
the basic iid version of the quickest change-point detection
problem. For a thorough treatment of the general non-iid case,
see, e.g., [44, 49, 34] or [47, Part II].

Suppose one is able to sequentially observe a time series,
$\{X_n\}_{n\ge1}$, where the $X_n$’s are independent. Suppose further that
the statistical structure of the series is such that $X_1,\ldots,X_\nu$
are each distributed according to a known probability density
function (pdf) $f(x)$, while $X_{\nu+1},X_{\nu+2},\ldots$ each have a
pdf $g(x)\not\equiv f(x)$, also known. The basic iid quickest change-point
detection problem is to detect, as one gathers more
and more data, that the baseline pdf of the data is no longer
$f(x)$, and do so in an optimal manner. The challenge is that
the time index $\nu$, which is referred to as the change-point,
is not known in advance, and the change may take place at any time
$0\le\nu\le\infty$; here and onward, the notation $\nu=0$ ($\nu=\infty$)
is to be understood as the case when the change is in effect
from the get-go (or never, respectively). The minimax
version of the problem assumes that $\nu$ is unknown (but not
random). This is different from the Bayesian version of the
problem, which regards $\nu$ as random [42, 43, 44]. In this work
we shall focus only on the minimax case.

Statistically, the problem is to sequentially test the hypotheses
$H_k\colon\nu=k$, $0\le k<\infty$ (i.e., that the pdf of the
observations changes at epoch $k$) against the alternative hypothesis
$H_\infty\colon\nu=\infty$ (i.e., that the pdf never changes); note
that $H_i\cap H_j=\varnothing$, $i\neq j$, and that $\bigcup_{j\ge0}H_j=\Omega$.

The first step to test $H_k$ against $H_\infty$ is to construct the
corresponding likelihood ratio (LR). To that end, assuming
$X_1,X_2,\ldots,X_n$ have been sampled, the LR is of the form

$\Lambda_{k:n}=\prod_{j=k+1}^{n}\Lambda_j$, where $\Lambda_j=\dfrac{g(X_j)}{f(X_j)}$,

for $k<n$ and $\Lambda_{k:n}=1$ for $k\ge n$; the latter condition merely
means that the change has not yet happened. The sequence
$\{\Lambda_{k:n}\}_{1\le k\le n}$ has to be updated “on-the-go”, incorporating
new data points as they become available.

Once constructed, the LR is turned into a detection statistic
to be subsequently used for actual decision-making. Basing
the detection statistic on the LR ensures that the former
is sensitive to whether the sample drawn so far is statistically
homogeneous or not. There are generally two fundamentally
different ways to utilize the LR to design a “good”
detection statistic: either exploit the maximum likelihood
principle or take the (generalized) Bayesian approach. This
is shown schematically in Figure 1.



Figure 1: Two different approaches to statistical inference:
maximum likelihood and (generalized) Bayesian.

The idea of the maximum likelihood approach is to
sequentially maximize $\{\Lambda_{k:n}\}_{1\le k\le n}$ with respect to the
change-point $\nu=k$, where $k=1,2,\ldots,n$. Specifically, the
corresponding detection statistic is

(1)  $V_n=\max_{1\le k\le n}\Lambda_{k:n}$, $n\ge1$,

which is the famous CUSUM statistic [25]. We note that
the maximization with respect to $k$ in the right-hand side
of (1) is possible because the change-point $\nu=k$ is assumed
unknown (nonrandom).

By contrast, the Bayesian approach treats the change-point
as a random number, possessing a certain prior distribution
[9, 42, 43, 49, 34]. However, since we agreed to
assume that $\nu$ is unknown (nonrandom), the corresponding
quasi-Bayesian (or generalized Bayesian) detection statistic
can be defined as

(2)  $R_n=\sum_{k=1}^{n}\Lambda_{k:n}$, $n\ge1$,

i.e., $R_n$ is effectively the average of $\{\Lambda_{k:n}\}_{1\le k\le n}$ taken with
respect to the change-point $\nu=k$, $1\le k\le n$, assuming
that it follows an (improper) uniform prior distribution; see,
e.g., [9, 42, 43, 49, 34].

Statistics (1) and (2) are the two main choices in all
of quickest change-point detection. Both lead to efficient
sequential detection procedures. Specifically, a sequential
detection procedure is identified with a stopping time, $T$,
which is a functional of the observed data, $\{X_n\}_{n\ge1}$. The
meaning of $T$ is that after observing $X_1,\ldots,X_T$ it is declared
that apparently the change is in effect. This need not
be the case, and if it is not the case, then $T\le\nu$ and the
detection procedure $T$ is said to have sounded a false alarm.
A “good” (i.e., optimal or nearly optimal) detection procedure
is one that minimizes (or nearly minimizes) the desired
detection delay penalty-function, subject to a constraint on
the false alarm risk. For an overview of the major optimality
criteria, see, e.g., [49, 34, 32, 55], or [47, Part II].

Let $\mathrm{P}_k(\cdot)$, $0\le k\le\infty$, denote the probability measure
assuming that $\nu=k$, $0\le k\le\infty$ (so that $\mathrm{P}_\infty(\cdot)$ corresponds
to the case when $\nu=\infty$). Let $\mathrm{E}_k[\cdot]$, $0\le k\le\infty$, be the
corresponding expectation.

Page [25] and then also Lorden [21] proposed to measure
the “false alarm” risk through the Average Run Length
(ARL) to false alarm, $\mathrm{ARL}(T)=\mathrm{E}_\infty[T]$. This metric captures
the average number of observations that the procedure
samples before it triggers a false alarm. The higher (lower)
the level of the ARL to false alarm, the lower (higher) the
actual level of the “false alarm” risk.

A practical approach to quantifying the detection speed is
to use the “worst-case” (Supremum) Average Delay to Detection
(ADD), conditional on no false alarm having
occurred previously, i.e.,

$\mathrm{SADD}(T)=\max_{0\le k<\infty}\mathrm{ADD}_k(T)$,

where $\mathrm{ADD}_k(T)=\mathrm{E}_k[T-k\mid T>k]$, $0\le k<\infty$. This metric
was introduced by Pollak [26].

Let

$\Delta(\gamma)\triangleq\{T\colon\mathrm{ARL}(T)\ge\gamma\}$, $\gamma>1$,

i.e., $\Delta(\gamma)$ is the class of procedures with the ARL to false alarm
of at least $\gamma>1$, an a priori chosen level. Then Pollak’s [26]
minimax quickest change-point detection problem is to find
$T_{\mathrm{opt}}\in\Delta(\gamma)$ such that $\mathrm{SADD}(T_{\mathrm{opt}})=\inf_{T\in\Delta(\gamma)}\mathrm{SADD}(T)$
for all $\gamma>1$. This problem is still an open one, and although
there has been a continuous effort to solve it, the
exact solution has been obtained in only two special cases
(see [33, 51]) and, in general, only asymptotic (as $\gamma\to\infty$)
solutions have been obtained so far [26, 50].

As was mentioned earlier, Page’s [25] CUSUM chart has
been one of the main tools for change-point detection. Part
of the reason is the fact that the CUSUM chart is strictly
minimax with respect to Lorden’s [21] criterion for every $\gamma>1$;
see [24, 37]. The CUSUM chart is based on the maximum
likelihood principle: it iteratively maximizes $\mathcal{L}_n=\log\Lambda_n$,
i.e., the log-likelihood ratio (LLR), with respect to the
change-point and stops as soon as the running maximum
exceeds a certain threshold. More specifically, the CUSUM
chart is based on the statistic $W_n=\max\{0,\log V_n\}$, where
$V_n$ is as in (1). Note that $W_n$ satisfies the recurrence

(3)  $W_n=\max\{0,W_{n-1}+\mathcal{L}_n\}$, $n\ge1$, $W_0=0$.


The corresponding stopping rule is

(4)  $\mathcal{C}_h=\min\{n\ge1\colon W_n\ge h\}$,

where $h>0$ is a detection threshold preset so as to achieve
the desired level $\gamma>1$ of the ARL to false alarm, and thus
guarantee $\mathcal{C}_h\in\Delta(\gamma)$. Since $\mathrm{ARL}(\mathcal{C}_h)\ge e^h$ for any $h>0$
(see [21] for a proof), setting $h=h_\gamma\ge\log\gamma$ is sufficient to
ensure $\mathcal{C}_h\in\Delta(\gamma)$. A more accurate approximation (mentioned,
e.g., in [35]) for $\mathrm{ARL}(\mathcal{C}_h)$ is as follows:


(5)  $\mathrm{ARL}(\mathcal{C}_h)\approx\ldots$  [right-hand side missing in the source]

where $I_f=-\mathrm{E}_\infty[\mathcal{L}_1]$ and $I_g=\mathrm{E}_0[\mathcal{L}_1]$ denote the Kullback-Leibler
information numbers (here and throughout the rest
of this section it is to be assumed that $0<I_f<\infty$ and
$0<I_g<\infty$). The indices $I_f$ and $I_g$ that appear in the right-hand
side of (5) are quantitative measures of the “contrastness”
of the change, and play an important role in change-point
detection.
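To make the recursion (3) and the stopping rule (4) concrete, the following minimal Python sketch runs the CUSUM chart on a stream of LLR increments. The Gaussian mean-shift model (pre-change N(0,1), post-change N(theta,1)), the function names and the toy parameters are assumptions made purely for illustration; they are not part of the procedure as such.

```python
import math
import random


def gaussian_llr(x, theta):
    """LLR log(g(x)/f(x)) for f = N(0, 1), g = N(theta, 1) (illustrative model)."""
    return theta * x - 0.5 * theta * theta


def cusum_stopping_time(llr_stream, h):
    """Run W_n = max(0, W_{n-1} + L_n), W_0 = 0; stop at the first n with W_n >= h.

    Returns the 1-based stopping index, or None if the threshold is never reached.
    """
    w = 0.0
    for n, llr in enumerate(llr_stream, start=1):
        w = max(0.0, w + llr)
        if w >= h:
            return n
    return None


if __name__ == "__main__":
    rng = random.Random(1)
    theta, nu = 1.0, 200          # hypothetical shift size and change-point
    h = math.log(1000.0)          # h = log(gamma) suffices for ARL >= gamma
    # X_1..X_nu ~ N(0,1); X_{nu+1}, X_{nu+2}, ... ~ N(theta,1)
    xs = [rng.gauss(0.0, 1.0) for _ in range(nu)]
    xs += [rng.gauss(theta, 1.0) for _ in range(300)]
    print(cusum_stopping_time((gaussian_llr(x, theta) for x in xs), h))
```

For this particular Gaussian model the two Kullback-Leibler numbers coincide, $I_f=I_g=\theta^2/2$, so a smaller shift $\theta$ means a fainter change.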

To define $\zeta$, let $\{Z_n\}_{n\ge0}$ be the random walk
$Z_n=\sum_{k=1}^{n}\mathcal{L}_k$, $n\ge1$, with $Z_0=0$. For $a\ge0$, introduce the
one-sided stopping time $\tau_a=\inf\{n\ge1\colon Z_n\ge a\}$ and let
$\kappa_a=Z_{\tau_a}-a$ denote the overshoot (i.e., the excess of $Z_n$ over
the level $a$ at stopping). Then $\zeta=\lim_{a\to\infty}\mathrm{E}_0[e^{-\kappa_a}]$, which
is the limiting exponential overshoot. This model-dependent
constant falls within the scope of nonlinear renewal theory,
and it can be shown that

(6)  $\zeta=\dfrac{1}{I_g}\exp\Big\{-\sum_{k=1}^{\infty}\dfrac{1}{k}\big[\mathrm{P}_\infty(Z_k>0)+\mathrm{P}_0(Z_k\le0)\big]\Big\}$;

cf., e.g., [57, Chapters 2 & 3] and [46, Chapter VIII].
Define also $\varkappa=\lim_{a\to\infty}\mathrm{E}_0[\kappa_a]$, which is the limiting
overshoot. By methods of nonlinear renewal theory it can
also be shown that

(7)  [display equation missing in the source]

cf., e.g., [57, Chapters 2 & 3] and [46, Chapter VIII]. In
practice, $\zeta$ and $\varkappa$ are usually computed numerically using (6)
and (7), respectively.
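As an example of such a numerical computation, the sketch below evaluates (6) by truncating the series, under the assumption (ours, for illustration) that the pre- and post-change pdfs are N(0,1) and N(theta,1). In that case $Z_k$ is Gaussian with mean $\mp k\theta^2/2$ and variance $k\theta^2$ under $\mathrm{P}_\infty$ and $\mathrm{P}_0$, respectively, so both probabilities in (6) reduce to $\Phi(-\theta\sqrt{k}/2)$, and $I_g=\theta^2/2$. The function names and the truncation level are hypothetical.

```python
import math


def std_normal_cdf(x):
    """Standard normal CDF expressed through the complementary error function."""
    return 0.5 * math.erfc(-x / math.sqrt(2.0))


def zeta_gaussian(theta, n_terms=5000):
    """Evaluate (6) for f = N(0,1), g = N(theta,1) with the series cut at n_terms.

    Both P_inf(Z_k > 0) and P_0(Z_k <= 0) equal Phi(-|theta| * sqrt(k) / 2),
    hence the factor 2 in each term; I_g = theta^2 / 2.
    """
    i_g = 0.5 * theta * theta
    series = sum(
        2.0 * std_normal_cdf(-0.5 * abs(theta) * math.sqrt(k)) / k
        for k in range(1, n_terms + 1)
    )
    return math.exp(-series) / i_g


if __name__ == "__main__":
    for theta in (0.5, 1.0, 2.0):
        print(theta, zeta_gaussian(theta))
```

The terms of the series decay roughly like $e^{-k\theta^2/8}$, so for moderate $\theta$ a few thousand terms are more than enough; very small $\theta$ would call for a larger truncation level.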

It can be shown (see, e.g., [46]) that for the basic iid
change-point problem $\mathrm{SADD}(\mathcal{C}_h)=\mathrm{E}_0[\mathcal{C}_h]$. Let $h=h_\gamma$,
where $h_\gamma$ is the solution of the equation $\mathrm{ARL}(\mathcal{C}_{h_\gamma})=\gamma$.
Then

(8)  $\mathrm{SADD}(\mathcal{C}_{h_\gamma})=\dfrac{1}{I_g}(h_\gamma+\varkappa+\beta_0)+o(1)$ as $\gamma\to\infty$,

where $\beta_0=\mathrm{E}_0[\min_{n\ge0}Z_n]$. This property of the CUSUM
chart is known as second-order asymptotic SADD(T)-optimality.
Expansion (8) was first obtained in [4] for the
single-parameter exponential family. However, it holds in a
more general case as well, as long as certain mild conditions
imposed on $\mathcal{L}_1$ are satisfied. See [48], where it is also shown
that

(9)  $\lim_{k\to\infty}\mathrm{ADD}_k(\mathcal{C}_{h_\gamma})=\dfrac{1}{I_g}(h_\gamma+\varkappa-\beta_\infty)+o(1)$ as $\gamma\to\infty$,

where $\beta_\infty=\lim_{n\to\infty}\mathrm{E}_\infty[Z_n-\min_{0\le k\le n}Z_k]$. In practice,
the constants $\beta_0$ and $\beta_\infty$ are also usually computed numerically
(e.g., by Monte Carlo simulations). We also note that the
two asymptotics (8) and (9) are inversely proportional to
the Kullback-Leibler information number $I_g$. This number
is sensitive to how faint or how contrasting the change is. Specifically,
$I_g$ is small for faint changes, and is large otherwise.
Therefore, according to (8) and (9), the average delay to
detection turns out to be large for faint changes and small
otherwise, which makes perfect sense.

Consider now a context in which it is of utmost importance
to detect the change as quickly as possible, even at
the expense of raising many false alarms (using a repeated
application of the same stopping rule) before the change occurs.
Put otherwise, in exchange for the assurance that the
change will be detected with maximal speed, we agree to go
through a “storm” of false alarms along the way (the false
alarms ensue from repeatedly applying the same detection
rule, starting from scratch after each false alarm). This
scenario is shown in Figure 2.

Formally, let $T_1,T_2,\ldots$ be sequential independent repetitions
of the stopping time $T$, and let $\mathcal{T}_j=T_1+T_2+\cdots+T_j$,
$j\ge1$, be the time of the $j$-th alarm. Define
$I_\nu=\min\{j\ge1\colon\mathcal{T}_j>\nu\}$. In other words, $\mathcal{T}_{I_\nu}$ is the time of detection of a
true change that occurs at $\nu$ after $I_\nu-1$ false alarms have
been raised. Write

$\mathrm{STADD}(T)\triangleq\lim_{\nu\to\infty}\mathrm{E}_\nu[\mathcal{T}_{I_\nu}-\nu]$


for the limiting value of the average delay to detection
referred to as the Stationary Average Delay to Detection
(STADD). The multi-cyclic change-point detection problem
is to find $T_{\mathrm{opt}}\in\Delta(\gamma)$ such that
$\mathrm{STADD}(T_{\mathrm{opt}})=\inf_{T\in\Delta(\gamma)}\mathrm{STADD}(T)$ for every $\gamma>1$. Since in this setup
$\mathrm{ARL}(T)$ is effectively the average distance between successive
false alarms, the reciprocal $1/\mathrm{ARL}(T)$ can be interpreted
as the frequency of false alarms. The “intrinsic assumption”
of the multi-cyclic change-point detection problem
is that the process under surveillance is not expected to
be affected by change “for a while”, i.e., the change-point,
$\nu$, is large. This is a reasonable assumption, e.g., in the area
of computer network anomaly detection (see, e.g., [35, 52])
and in financial surveillance.
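The multi-cyclic regime described above can be mimicked in a few lines of Python: the detection statistic is renewed from zero after every alarm, and the delay is measured from the change-point to the first alarm that follows it. The sketch below is only an illustration under our own naming; `multicyclic_alarm_times`, `detection_delay` and the two update helpers are hypothetical, and the LLR increments are supplied by the caller.

```python
import math


def cusum_update(w, llr):
    """One step of the CUSUM recursion W_n = max(0, W_{n-1} + L_n)."""
    return max(0.0, w + llr)


def sr_update(r, llr):
    """One step of the SR recursion R_n = (1 + R_{n-1}) * exp(L_n)."""
    return (1.0 + r) * math.exp(llr)


def multicyclic_alarm_times(llr_values, threshold, update):
    """Apply a detection rule repeatedly, restarting from zero after each alarm.

    Returns the 1-based times of the successive alarms.
    """
    alarms, stat = [], 0.0
    for n, llr in enumerate(llr_values, start=1):
        stat = update(stat, llr)
        if stat >= threshold:
            alarms.append(n)
            stat = 0.0  # renew the procedure from scratch
    return alarms


def detection_delay(alarms, nu):
    """Delay of the first alarm occurring strictly after the change-point nu."""
    for t in alarms:
        if t > nu:
            return t - nu
    return None


if __name__ == "__main__":
    llr = [1.0, 1.0, -1.0, 1.0, 1.0, 1.0]
    alarms = multicyclic_alarm_times(llr, 2.0, cusum_update)
    print(alarms, detection_delay(alarms, 3))
```

Averaging `detection_delay` over many simulated paths with a large change-point would give a Monte Carlo estimate of the STADD; that step is omitted here.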

As has been shown in [30, 31, 45], the Shiryaev-Roberts
(SR) procedure [42, 43, 39] is exactly optimal for every
$\gamma>1$ with respect to the stationary average detection delay
$\mathrm{STADD}(T)$. Thus, in the multi-cyclic setting the SR procedure
is a better alternative to the popular CUSUM chart.

The SR rule calls for stopping at epoch

(10)  $\mathcal{S}_A=\min\{n\ge1\colon R_n\ge A\}$,

where the SR statistic $\{R_n\}_{n\ge0}$ is given by the recursion

(11)  $R_n=(1+R_{n-1})\Lambda_n$, $n\ge1$, $R_0=0$;

cf. [42, 43] and [39]; here $A>0$ is a detection threshold set
a priori so as to ensure $\mathcal{S}_A\in\Delta(\gamma)$ for a desired $\gamma>1$. It can
be easily shown [27] that $\mathrm{ARL}(\mathcal{S}_A)\ge A$ for all $A>0$. Hence,
setting $A_\gamma=\gamma$ is sufficient to guarantee $\mathcal{S}_A\in\Delta(\gamma)$. A more
accurate asymptotic approximation is $\mathrm{ARL}(\mathcal{S}_A)\approx A/\zeta$ as
$A\to\infty$; see [27].
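The recursion (11) and the stopping rule (10) translate directly into code. The following Python sketch is a minimal illustration; the function name and the toy inputs are our own assumptions, and the LLR increments $\log\Lambda_n$ are taken as given.

```python
import math


def sr_stopping_time(llr_stream, A):
    """Run R_n = (1 + R_{n-1}) * exp(L_n), R_0 = 0; stop at the first n with R_n >= A.

    Returns the 1-based stopping index, or None if the threshold is never reached.
    """
    r = 0.0
    for n, llr in enumerate(llr_stream, start=1):
        r = (1.0 + r) * math.exp(llr)
        if r >= A:
            return n
    return None


if __name__ == "__main__":
    # With Lambda_n = 1 (LLR = 0) the statistic simply counts observations: R_n = n.
    print(sr_stopping_time([0.0] * 10, 5.0))
    # With Lambda_n = 2 the statistic grows as R_n = 2, 6, 14, ...
    print(sr_stopping_time([math.log(2.0)] * 5, 10.0))
```

Unlike the CUSUM statistic, the SR statistic is never reset to zero by the recursion itself; it only restarts when the procedure is renewed after an alarm.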

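Let $R_\infty$ be a random variable that has the $\mathrm{P}_\infty$-limiting
(stationary) distribution of $R_n$ as $n\to\infty$, i.e.,
$Q_{\mathrm{ST}}(x)=\lim_{n\to\infty}\mathrm{P}_\infty(R_n\le x)=\mathrm{P}_\infty(R_\infty\le x)$. Let $U=\ldots$ [definition missing in the source]
and $Q(x)=\mathrm{P}_0(U\le x)$.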

A straightforward argument shows that, for the SR procedure
considered under the basic iid change-point setup, if
$A=A_\gamma$ is the solution of the equation $\mathrm{ARL}(\mathcal{S}_{A_\gamma})=\gamma$, then
$\mathrm{SADD}(\mathcal{S}_{A_\gamma})=\mathrm{E}_0[\mathcal{S}_{A_\gamma}]$, and

(12)  $\mathrm{SADD}(\mathcal{S}_{A_\gamma})=\dfrac{1}{I_g}(\log A_\gamma+\varkappa-C_0)+o(1)$ as $\gamma\to\infty$,

where

$C_0\triangleq\mathrm{E}[\log(1+U)]=\int_0^\infty\log(1+x)\,dQ(x)$;

cf. [50]. The asymptotic expansion (12) shows that the SR
procedure is also asymptotically second-order SADD(T)-minimax.
In general, the constant $C_0$ and the distribution $Q(x)$
are amenable to numerical treatment. For cases where both
can be computed analytically and in a closed form see [50]
and [34].
Figure 2: Multi-cyclic change-point detection in a stationary regime. (a) An example of the behavior of a process of interest with a change in mean at time $\nu$. (b) Typical behavior of the detection statistic in the multi-cyclic mode.


For the multi-cyclic setting we have

$\mathrm{STADD}(\mathcal{S}_{A_\gamma})=\dfrac{1}{I_g}(\log A_\gamma+\varkappa-C_\infty)+o(1)$ as $\gamma\to\infty$,

where

$C_\infty=\mathrm{E}[\log(1+R_\infty+U)]=\int_0^\infty\!\!\int_0^\infty\log(1+x+y)\,dQ_{\mathrm{ST}}(x)\,dQ(y)$;

cf. [50].

We conclude this section with a remark that the exact
multi-cyclic optimality property of the SR procedure (11)-(10)
depends heavily on the assumption that the pre- and
post-change densities $f(x)$ and $g(x)$ are fully known. The
consequences of setting up the SR procedure to detect the
“wrong” change have been recently made clear in [5] where,
apparently for the first time in the literature, it was demonstrated
experimentally that, if ignored altogether, parametric
uncertainty in $g(x)$ may severely affect the STADD delivered
by the SR procedure: the relative loss in performance
can be on the order of hundreds of percent.


3. APPLICATION TO FINANCIAL 
SURVEILLANCE 

Since anomalous events in financial series happen at unknown
points in time, and entail changes in the series’ statistical
properties, it is intuitively appealing to devise a quickest
change-point detection method to detect the onset of
such changes as rapidly as possible, while maintaining the
false alarm risk at a tolerable level. This section is intended
to show how quickest change-point detection can be applied
to detect anomalies in “live” streams of financial data.

The main difficulty in applying either the CUSUM
chart (3)-(4) or the SR procedure (11)-(10) to real-world
data is that the pre- and post-anomaly distributions of the
data are poorly understood, if known at all. As a result, any
LR-based approach is effectively rendered useless. Hence, a
nonparametric approach might be in order. To that end, let
us first analyze how the LR exploited by both the CUSUM
chart (3)-(4) and the SR procedure (11)-(10) allows the two
procedures to sense the presence of a change. Specifically,
consider the behavior of $\mathcal{L}_n=\log\Lambda_n$ prior to the change
and under the change. Before the change, the LLR has
a negative expectation, i.e., $\mathrm{E}_\infty[\mathcal{L}_n]<0$. This causes the
CUSUM statistic to gravitate toward zero in the pre-change
regime, and causes the SR statistic to grow slower than it
would if the process had already undergone a change. However,
as soon as $X_{\nu+1}$, the first “anomalous” data point,
is recorded, the expectation of the LLR switches its sign
to positive, i.e., $\mathrm{E}_\nu[\mathcal{L}_n]>0$, $0\le\nu<n$. As a result, the
CUSUM statistic starts to drift away from zero up toward
the detection threshold, and the SR statistic’s growth rate
increases compared to what it would be had there been no
change. This difference in the behavior of each of the two
statistics under the pre-change regime and under the post-change
regime is the main reason why the CUSUM chart
and the SR procedure are able to sense the presence of a
change in the observations to begin with.

The above suggests that when it is impossible to construct
an LR, the latter can be replaced with a computable
score function $S_n=S_n(X_1,\ldots,X_n)$ such that $\mathrm{E}_\infty[S_n]<0$
for all $n\ge1$ and $\mathrm{E}_\nu[S_n]>0$ for all $0\le\nu<n$ with $n\ge1$.
This is the key element of the nonparametric approach, and
in the context of quickest change-point detection this idea
has been previously suggested and explored, e.g., by McDonald
[22], Lai [18], Gordon and Pollak [12, 13], and recently
also by Pollak [28]. A thorough exposition of the nonparametric
approach to change-point detection has been offered
by Brodsky and Darkhovsky [2, 3].

To be more specific, McDonald [22] suggested basing
surveillance on the series of sequential ranks
$U_n=\frac{1}{n}\sum_{k=1}^{n}\mathbb{1}_{\{X_k\le X_n\}}$, where $\mathbb{1}_{\{\cdot\}}$ denotes the indicator function.
The corresponding score function can be taken to be of
the form $S_n=U_n-C$, where $C>0$ is a design parameter
selected according to the expected type of change and
the desired level of the ARL to false alarm. That is, McDonald’s
[22] version of the CUSUM chart (3)-(4) signals
an alarm according to the stopping time
$\mathcal{C}_h^*=\min\{n\ge1\colon W_n^*\ge h\}$, where $W_n^*=\max\{0,W_{n-1}^*+S_n\}$, $n\ge1$,
and $h>0$ is the detection threshold. If the observations
$\{X_n\}_{n\ge1}$ are all iid, then the sequential ranks $U_n$ are approximately
uniform, whatever the observations’ common
baseline distribution. However, if, starting with the $\nu$-th data
point, $X_\nu$, the baseline distribution switches to a stochastically
larger distribution, the sequential ranks become larger,
causing the rank-based CUSUM chart to trigger an alarm.
This idea of McDonald [22] was then extended to the SR
procedure by Gordon and Pollak [12, 13] and by Pollak [28].
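A rank-based CUSUM chart of the kind just described can be sketched in Python as follows. This is an illustration only: we assume that the sequential rank of $X_n$ is normalized by $n$ (so that it lies in $(0,1]$), and the function names, the value of $C$ and the threshold are hypothetical choices of ours rather than values taken from [22].

```python
def sequential_ranks(xs):
    """Normalized sequential ranks U_n = (1/n) * #{k <= n : X_k <= X_n}.

    Assumption: ranks are scaled by n; this direct version costs O(n^2).
    """
    ranks = []
    for n in range(1, len(xs) + 1):
        count = sum(1 for k in range(n) if xs[k] <= xs[n - 1])
        ranks.append(count / n)
    return ranks


def rank_cusum_alarm(xs, c, h):
    """CUSUM driven by the scores S_n = U_n - c.

    Returns the first n with W*_n >= h, or None if no alarm is raised.
    """
    w = 0.0
    for n, u in enumerate(sequential_ranks(xs), start=1):
        w = max(0.0, w + u - c)
        if w >= h:
            return n
    return None


if __name__ == "__main__":
    print(sequential_ranks([3, 1, 2]))
    print(rank_cusum_alarm([1, 2, 3, 4, 5], c=0.5, h=1.2))
```

A steadily increasing sequence keeps every rank at its maximum, so the chart fires quickly, whereas a decreasing one pushes the statistic back to zero.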

More generally, for any appropriately designed score function
$S_n$, the original SR statistic $\{R_n\}_{n\ge0}$ given by (11) can
be replaced with

(13)  $R_n=(1+R_{n-1})e^{S_n}$, $n\ge1$, $R_0=0$,

so that the corresponding SR stopping time is of the form

(14)  $\mathcal{S}_A=\min\{n\ge1\colon R_n\ge A\}$,

where $A>0$ is the detection threshold. Likewise, for
the CUSUM chart, the original CUSUM statistic $\{W_n\}_{n\ge0}$
given by (3) can be replaced with

(15)  $W_n=\max\{0,W_{n-1}+S_n\}$, $n\ge1$, $W_0=0$,

so that the corresponding CUSUM stopping time becomes

(16)  $\mathcal{C}_h=\min\{n\ge1\colon W_n\ge h\}$,

where $h>0$ is again the detection threshold.
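The score-based recursions (13) and (15) differ from their LR-based counterparts only in that $\mathcal{L}_n$ is replaced by $S_n$, so both can be run side by side on the same stream of scores. The Python sketch below does exactly that; the function name and the toy scores are assumptions made for illustration.

```python
import math


def score_based_alarms(scores, A, h):
    """Run the score-based SR statistic (13) and CUSUM statistic (15) in parallel.

    Returns (sr_time, cusum_time): the first n with R_n >= A and the first n
    with W_n >= h, respectively (None where the threshold is never reached).
    """
    r, w = 0.0, 0.0
    sr_time = cusum_time = None
    for n, s in enumerate(scores, start=1):
        r = (1.0 + r) * math.exp(s)  # recursion (13)
        w = max(0.0, w + s)          # recursion (15)
        if sr_time is None and r >= A:
            sr_time = n
        if cusum_time is None and w >= h:
            cusum_time = n
        if sr_time is not None and cusum_time is not None:
            break
    return sr_time, cusum_time


if __name__ == "__main__":
    print(score_based_alarms([0.5] * 10, A=5.0, h=1.9))
```

Note that the two thresholds live on different scales ($A$ is comparable to $e^h$), so any comparison of the two stopping times is meaningful only after both thresholds have been tuned to the same ARL to false alarm.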

In order for the score-function-based SR procedure (13)-(14)
and CUSUM chart (15)-(16) to work well, the score
function $S_n=S_n(X_1,\ldots,X_n)$ has to be carefully designed,
incorporating the type of change expected. To illustrate
this point, suppose we are interested in detecting
a change in both the mean and variance of the observations.
Let $\mu_\infty=\mathrm{E}_\infty[X_n]$ and $\sigma_\infty^2=\mathrm{Var}_\infty[X_n]$, and
$\mu=\mathrm{E}_0[X_n]$ and $\sigma^2=\mathrm{Var}_0[X_n]$ denote the pre- and post-anomaly
mean values and variances, respectively. Introduce
$\tilde{X}_n=(X_n-\mu_\infty)/\sigma_\infty$, i.e., the centered and standardized
$n$-th data point. In real-world applications the pre-change
parameters $\mu_\infty$ and $\sigma_\infty^2$ can usually be estimated in advance
(e.g., using training data) and then periodically re-estimated
to account for the nonstationary nature of the data. To deal
with the uncertainty in $\mu$ and $\sigma^2$, consider the following
linear-quadratic score function

(17)  $S_n(\tilde{X}_n)=C_1\tilde{X}_n+C_2\tilde{X}_n^2-C_3$,

where $C_1$, $C_2$ and $C_3$ are design parameters; cf. [52]. Selecting
$C_1$, $C_2$ and $C_3$ to be positive would make this score
function sensitive to increases in the mean and variance. In
the case when the variance either does not change at all
or changes relatively insignificantly compared to the magnitude
of the change in the mean, the coefficient $C_2$ may be
set equal to zero. This appears to be typical for many cybersecurity
applications [53, 54, 52]. In the opposite case when
the mean changes only slightly compared to the variance,
one may take $C_1=0$.

Note that the score function $S_n$ given by (17) with

(18)  $C_1=\delta q^2$, $C_2=\dfrac{1-q^2}{2}$, $C_3=\dfrac{\delta^2q^2}{2}-\log q$,

where $q=\sigma_\infty/\sigma$, $\delta=(\mu-\mu_\infty)/\sigma_\infty$, is optimal if the pre-
and post-change distributions are Gaussian with known $\mu$
and $\sigma^2$. This is true because the score function $S_n$ given
by (17) is then nothing but the LLR. If one has reason
to believe that the time series of interest can be accurately
described by the Gaussian model, then selecting $q=q_0$ and
$\delta=\delta_0$ with some design values $q_0$ and $\delta_0$ would lead to
decent performance of the procedure for $q<q_0$ and $\delta>\delta_0$
and optimal (i.e., best) performance for $q=q_0$ and $\delta=\delta_0$.
However, it is important to emphasize that the proposed
score-based “tweak” of the SR procedure does not require the
observations to be Gaussian, whether pre- or post-change.
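The claim that (17) with the coefficients (18) coincides with the Gaussian LLR is easy to check numerically. The Python sketch below computes both quantities for arbitrary (hypothetical) parameter values; the function names and the numbers used are ours and serve only as an illustration.

```python
import math


def lq_coefficients(delta, q):
    """Coefficients (18): C1 = delta*q^2, C2 = (1 - q^2)/2, C3 = delta^2*q^2/2 - log(q)."""
    c1 = delta * q * q
    c2 = 0.5 * (1.0 - q * q)
    c3 = 0.5 * delta * delta * q * q - math.log(q)
    return c1, c2, c3


def lq_score(x, mu_inf, sigma_inf, c1, c2, c3):
    """Score (17) evaluated at the standardized observation (x - mu_inf) / sigma_inf."""
    y = (x - mu_inf) / sigma_inf
    return c1 * y + c2 * y * y - c3


def gaussian_llr(x, mu_inf, sigma_inf, mu, sigma):
    """Log of the N(mu, sigma^2) density over the N(mu_inf, sigma_inf^2) density at x."""
    return (math.log(sigma_inf / sigma)
            - 0.5 * ((x - mu) / sigma) ** 2
            + 0.5 * ((x - mu_inf) / sigma_inf) ** 2)


if __name__ == "__main__":
    mu_inf, sigma_inf, mu, sigma = 1.0, 2.0, 2.5, 3.0  # hypothetical values
    q = sigma_inf / sigma
    delta = (mu - mu_inf) / sigma_inf
    c1, c2, c3 = lq_coefficients(delta, q)
    for x in (-3.0, 0.0, 1.7, 6.0):
        print(x, lq_score(x, mu_inf, sigma_inf, c1, c2, c3),
              gaussian_llr(x, mu_inf, sigma_inf, mu, sigma))
```

With these values the variance increases ($q<1$) and the mean shifts upward ($\delta>0$), so $C_1$ and $C_2$ come out positive, in agreement with the remark on the sign of the design parameters.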

For examples of score-functions that exploit SSA, see, 
e.g., [10, 23, 58, 11]. 

Another way to deal with parametric uncertainty in the
observations’ post-change distribution is to employ the generalized
likelihood ratio (GLR) approach. However, the obvious
problem with this approach is that the recursive evaluation
of the running LR (either as in (1) or as in (11))
might get computationally too difficult to carry out, because
now the LR has to be also maximized with respect
to the unknown parameter. As a way around this, Willsky
and Jones [56] and then also Lai [18, 19] suggested
restricting attention to a certain limited number of the most
recent observations, and, based on that idea, introduced the
appropriate “window-limited” modification of the CUSUM
chart. The main question here, however, is how to choose the
size of the window, i.e., the optimal number of the most recent
observations to take into account. On the one hand,
if that number is too large, the corresponding “window-limited”
CUSUM statistic might still be too computationally
demanding. On the other hand, basing the decision on
too small a number of the latest observations is likely to lead
to an increase in the detection delay. To optimize the trade-off
between the computational tractability and the speed of
detection, Lai [18, 19] showed that the “best” strategy is to
factor in the latest $m_\gamma$ observations with $m_\gamma$ being of the
order $O(\log\gamma/I_g)$, where $\gamma>1$ is the desired level of the
ARL to false alarm and $I_g=\mathrm{E}_0[\mathcal{L}_1]$ is the Kullback-Leibler
information number.

4. A CASE STUDY 

We now consider a case study where we employ the proposed
change-point detection methodology to “sniff out”
structural breaks in a real-world financial time series. Specifically,
our intent is two-fold: to first provide the steps necessary
to configure our change-point detection procedures and
then, once the latter are properly set up, to also demonstrate
and discuss their performance.

4.1 Data description 

The time series we would like to study is the daily stock
prices (at closing) of Host Hotels & Resorts, Inc. (see on
the Web at www.hosthotels.com) for the period from January 3,
2000 through March 30, 2007. Host Hotels & Resorts,
Inc. is the largest American lodging and real estate investment
trust (or REIT), headquartered in Bethesda, Maryland,
USA. An S&P 500 and Fortune 500 company, Host
Hotels & Resorts, Inc. is also one of the biggest owners of
luxury and upper-upscale hotels. Its hotels are operated under
such reputable brand names as Marriott, Ritz-Carlton,
Four Seasons, Hyatt and Hilton. Its stock is traded on the
New York Stock Exchange (NYSE) under the ticker HST.
Our interest in the company is due to its leading position
in the industry and the significant size of its assets: as of
December 31, 2014, its reported total assets were over $12
billion (with liabilities and debt totaling about $4.6 billion)
[17, p. 88].

Historical data for the HST stock for any period since the
stock began trading on the NYSE are freely available on the
Internet (e.g., via Yahoo! Finance; see www.finance.yahoo.com).
We used the Machine Learning Data Set Repository
(see on the Web at www.mldata.org). The total length of
the series is N = 1812 data points. We chose to focus on
the period between January 3, 2000 and March 30, 2007
because the company’s history was very eventful during that
time period: the tragic events that took place in New York
City on September 11, 2001, and the decade-long global economic
unrest that followed caused considerable turbulence
in the company’s financial well-being. As a result, one would
expect the statistical dynamics of the HST stock within the chosen
time frame to experience multiple changes. This makes
change-point analysis of the data both interesting and challenging.

4.2 Preliminary statistical analysis 

To perform basic statistical analysis of the data, the natural
point of departure is to graph the data against time. This is
done in Figure 3. Even a simple visual examination of the plot
suggests several observations.

Figure 3: Daily stock prices (at closing) for Host Hotels &
Resorts, Inc. (NYSE: HST) for the period from January 3,
2000 through March 30, 2007.

First note that, as expected, the series appears to be rife
with structural breaks of various scale and type. The
following three are particularly notable: one occurring toward
the end of the third quarter of 2001, followed by one more
occurring around the end of the first quarter of 2002,
followed by yet another one occurring in 2003, around the end
of the first quarter.

The first of these break-points, viz. the one occurring in
2001, appears to be a crash-type event, as at that point the
stock price essentially plummets, from about $13/share right
before the break to roughly $7/share shortly after the break.
The reason for such a huge loss in value is not hard to figure
out: it was the result of the 9/11 terrorist attacks on the
World Trade Center (WTC) Towers in New York City.
Specifically, in addition to destroying the Towers, the
attacks also destroyed the New York World Trade Center
Marriott hotel owned and operated by Host Hotels & Resorts,
Inc. To boot, the company also sustained considerable damage
to its second property located nearby, the New York Marriott
Financial Center hotel. However, by the end of 2001, the
company received the property and business interruption
insurance for the two hotels [14], and the stock began to
climb back up.

The second of the above three major break-points, namely, the
2002 one, also appears to be a negative event in the company’s
history. According to the company’s 2002 annual report [15],
the company’s revenue for the year was negatively affected by
the overall weakness of the US and global economies, which in
particular resulted in business and leisure travel dropping
below historic levels in 2002.

The third break-point (the one occurring around the first
quarter of 2003) appears to be a “turning point” for the
company, because following this break-point, the stock begins
to exhibit a consistent upward trend that lasts for years. The
specific date of this “turning point” is March 14, 2003.
According to the company’s 2003 annual report [16], 2003 was
indeed a year of recovery for the company: it collected
additional insurance on the hotels that were destroyed during
the 9/11 attacks in 2001, sold hotels that had been found to
be inefficient, and used the proceeds to substantially reduce
the corporate debt.

Another observation that can be made from Figure 3 is 
that the HST stock appears to have a seasonal component. 
This should not come as a surprise, since for the hotel in¬ 
dustry seasonal effects are common and, in fact, natural. 
However, dealing with such effects statistically is somewhat 
orthogonal to the objective of our study. Nevertheless, we 
would like to mention that the numerous and extensive case 
studies offered, e.g., in [10, 23, 11], suggest that the SSA 
methodology can be rather efficient in the analysis of sea¬ 
sonal and cyclic patterns. 

To reinforce the observations made so far, Figure 4 shows
the behavior of the daily returns $d_i = X_{i+1} - X_i$,
$i = 1, \ldots, N-1$, on the HST stock. The returns provide a
different perspective on the behavior of the stock itself. As
a matter of fact, it is the returns that are usually used as
the input data to perform statistical inference on the
underlying stock. Therefore, we too shall proceed with the
returns being the series of interest.
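As a minimal sketch of the differencing step just described, the returns can be computed from a price series as follows. The function name `daily_returns` and the sample prices are hypothetical and serve only as an illustration; they are not HST data.

```python
from typing import List, Sequence


def daily_returns(prices: Sequence[float]) -> List[float]:
    """Return d_i = X_{i+1} - X_i for i = 1, ..., N-1 (first differences)."""
    return [later - earlier for earlier, later in zip(prices, prices[1:])]


# Hypothetical closing prices, for illustration only (not HST data).
closing = [13.0, 12.5, 12.75, 7.0, 7.25]
returns = daily_returns(closing)
print(returns)  # N = 5 prices yield N - 1 = 4 returns
```

A series of $N$ prices yields $N-1$ returns, so a single-day crash in the price shows up as one large negative return rather than as a lasting level shift.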

One can clearly see a large downward spike around the third
quarter of 2001. This spike corresponds to the HST stock
losing approximately half of its value as the result of the
9/11 terrorist attacks in NYC. While this spike is highly
pronounced, there is no apparent change in the daily return
distribution corresponding to the 2003 structural break.
Nevertheless, as we shall see shortly, the 2003 break-point is
detectable. More importantly, in spite of the steady growth of
the stock after the 2003 break-point shown in Figure 3, the
behavior of the return series does not visibly confirm any
shift in the mean.

Figure 4: Daily returns on the stock (evaluated at closing) of
Host Hotels & Resorts, Inc. (NYSE: HST) for the period from
January 3, 2000 through March 30, 2007 (horizontal axis:
date).

4.3 Offline structural break detection 


We now perform a more thorough statistical analysis 
of the data. Specifically, we would like to devise a sta¬ 
tistical procedure to detect the aforementioned structural 
breaks. Toward this goal, the first step is to analyze the 
series retrospectively so as to not only detect the changes, 
but also to estimate their locations. One such “offline” 
change-point detection-estimation statistic is the Brodsky- 
Darkhovsky statistic proposed and studied in [2, 3]. The 
Brodsky-Darkhovsky statistic is defined as 


(19)  $$Y_N(n) \triangleq \frac{n(N-n)}{N^2} \left( \frac{1}{n} \sum_{i=1}^{n} X_i - \frac{1}{N-n} \sum_{i=n+1}^{N} X_i \right),$$

where $1 \le n \le N-1$. As can be seen from the structure
of the statistic, it is effectively the difference between two
sample means: one computed off the first $n \ge 1$ data points
(i.e., $X_1, \ldots, X_n$), and one computed off the remaining
$N-n$ data points (i.e., $X_{n+1}, \ldots, X_N$) in a chunk of
$N \ge n+1$ observations $X_1, \ldots, X_N$. Therefore the
statistic (19) is tailored specifically to detect deviations
in the observations’ mean. The actual detection procedure
consists in comparing $|Y_N(n)|$ indexed by
$n = 1, \ldots, N-1$ against a threshold selected according to
the desired significance level. More importantly, the
statistic can also be used to estimate the actual
change-point, $\nu$, i.e., the time moment at which the
series’ baseline mean (apparently) changes. Specifically, this
is accomplished by first identifying the set of values of $n$
for which $|Y_N(n)|$ is maximized, and then using any such $n$
as an estimator of the actual change-point, that is,

(20)  $$\hat{\nu}_N = \operatorname*{arg\,max}_{1 \le n \le N-1} |Y_N(n)|.$$

It has been shown in [2] that such an estimator enjoys strong
consistency (as $N \to \infty$) with exponential rate of
convergence.
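The statistic (19) and the estimator (20) can be sketched in a few lines of Python. This is an illustration under stated assumptions rather than the authors' implementation: the weight is taken to be $n(N-n)/N^2$ as (19) reads here, ties in the maximization are broken by taking the first maximizer, and the names `bd_statistic` and `bd_change_point` are hypothetical.

```python
from itertools import accumulate
from typing import List, Sequence, Tuple


def bd_statistic(x: Sequence[float]) -> List[float]:
    """Y_N(n) for n = 1, ..., N-1, with the weight n(N-n)/N^2 as in (19)."""
    N = len(x)
    if N < 2:
        raise ValueError("need at least two observations")
    prefix = list(accumulate(x))  # prefix[n-1] = x_1 + ... + x_n
    total = prefix[-1]
    values = []
    for n in range(1, N):
        left_mean = prefix[n - 1] / n
        right_mean = (total - prefix[n - 1]) / (N - n)
        values.append(n * (N - n) / N**2 * (left_mean - right_mean))
    return values


def bd_change_point(x: Sequence[float]) -> Tuple[int, float]:
    """Return (n_hat, |Y_N(n_hat)|), with n_hat the first maximizer in (20)."""
    abs_values = [abs(v) for v in bd_statistic(x)]
    best = max(range(len(abs_values)), key=abs_values.__getitem__)
    return best + 1, abs_values[best]  # +1 converts to the 1-based index n


# Toy series (an assumption, not HST data): the mean jumps after point 10.
toy = [0.0] * 10 + [1.0] * 10
print(bd_change_point(toy))
```

Prefix sums keep the cost linear in $N$; recomputing both sample means from scratch for every $n$ would be quadratic.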

We have applied the Brodsky-Darkhovsky approach to the
returns $\{d_i\}_{1 \le i \le N-1}$, and the obtained behavior
of the statistic (19) for $1 \le n \le N-1$ is shown in
Figure 5. It can be seen from the figure that the statistic
exhibits a whole series of local yet fairly pronounced maxima.
The unique and rather strong global maximum occurring around
the first quarter of 2003 reinforces the observation made
earlier that the HST stock undergoes a structural break at
that time. The specific location of the global maximum
corresponds to March 14, 2003, which is the Brodsky-Darkhovsky
estimate of the actual change-point. We note that this date
coincides exactly with the 2003 break-point identified above.



Figure 5: Behavior of the Brodsky-Darkhovsky statistic for 
the HST stock series. 

Continuing our analysis of the Brodsky-Darkhovsky statistic
shown in Figure 5, the spike occurring in the second half of
2001 can be attributed to the 2001 HST stock crash caused by
the 9/11 attacks in NYC. The specific value of the
Brodsky-Darkhovsky estimate of this change-point is
September 18, 2001, which is one week after the 9/11 attacks.
This estimate can be refined using the following strategy.
Once the global maximum of the Brodsky-Darkhovsky statistic is
identified, the series is partitioned into two nonoverlapping
segments: one composed of the observations up to the
change-point and one consisting of the observations following
the change-point. Then the Brodsky-Darkhovsky
detection-estimation method is applied again, individually, to
each of the two data chunks. It is argued in [2, 3] that this
“divide and conquer” type of approach also yields a strongly
consistent (as the sample size grows without bound) estimator
of the change-point.
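The “divide and conquer” strategy described above can be sketched recursively. The stopping rule is not specified in the text, so the `threshold` and `min_len` parameters below are hypothetical tuning knobs introduced for illustration, as are the function names; the weight $n(N-n)/N^2$ is again an assumption about how (19) reads.

```python
from itertools import accumulate
from typing import List, Sequence, Tuple


def argmax_abs_bd(x: Sequence[float]) -> Tuple[int, float]:
    """First n in 1..N-1 maximizing |Y_N(n)|, together with that maximum."""
    N = len(x)
    prefix = list(accumulate(x))
    total = prefix[-1]
    best_n, best_val = 1, -1.0
    for n in range(1, N):
        diff = prefix[n - 1] / n - (total - prefix[n - 1]) / (N - n)
        y = abs(n * (N - n) / N**2 * diff)
        if y > best_val:
            best_n, best_val = n, y
    return best_n, best_val


def segment(x: Sequence[float], threshold: float,
            min_len: int = 10, offset: int = 0) -> List[int]:
    """Recursively split x at the maximizer of |Y_N(n)|.

    threshold and min_len are hypothetical knobs: a segment is split only if
    it has at least 2 * min_len points and its peak |Y_N(n)| reaches threshold.
    Returns estimated change-points as 1-based indices into the full series.
    """
    if len(x) < 2 * min_len:
        return []
    n_hat, peak = argmax_abs_bd(x)
    if peak < threshold:
        return []
    left = segment(x[:n_hat], threshold, min_len, offset)
    right = segment(x[n_hat:], threshold, min_len, offset + n_hat)
    return left + [offset + n_hat] + right


# Noise-free toy series (an assumption): the mean changes after 30 and 60.
toy = [0.0] * 30 + [1.0] * 30 + [3.0] * 30
print(segment(toy, threshold=0.1))
```

In practice the threshold would be tied to a significance level, as the text indicates for the single-split procedure; the fixed value used here is only a placeholder.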

We now follow this strategy and analyze each piece of data
separately. For the data segment to the left of the 2003
break-point the sample mean and standard deviation are
$\hat{\mu}_\infty \approx -0.0029$ and
$\hat{\sigma}_\infty \approx 0.2266$, respectively. The same
sample characteristics for the data segment to the right of
the 2003 break-point turned out to be
$\hat{\mu}_0 \approx 0.0199$ and
$\hat{\sigma}_0 \approx 0.2306$. Therefore, the 2003
break-point changes not only the mean but also the variance.
However, the change in the mean is far more pronounced than
the change in the variance. This could be part of the reason
for the excellent performance of the Brodsky-Darkhovsky
statistic (19).
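To put the two changes on a common footing, one may express the shift in the mean in units of the pre-break standard deviation and compare it with the ratio of the two variances. Writing $\hat{\mu}_\infty$, $\hat{\sigma}_\infty$ for the pre-break sample mean and standard deviation and $\hat{\mu}_0$, $\hat{\sigma}_0$ for their post-break counterparts, the estimates quoted above give approximately the following. The normalization is an illustrative choice made here, not a computation taken from the analysis itself.

```latex
\[
\frac{\hat{\mu}_0 - \hat{\mu}_\infty}{\hat{\sigma}_\infty}
  \approx \frac{0.0199 - (-0.0029)}{0.2266}
  = \frac{0.0228}{0.2266} \approx 0.101,
\qquad
\frac{\hat{\sigma}_0^2}{\hat{\sigma}_\infty^2}
  \approx \left(\frac{0.2306}{0.2266}\right)^2 \approx 1.036 .
\]
```

In other words, the mean moves by about a tenth of a standard deviation and changes sign, whereas the variance changes by less than 4%.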

Figure 6 shows the empirical probability densities
(histograms) for the returns before (see Figure 6a) and after
(see Figure 6b) the 2003 event. Each of the two histograms is
accompanied by a Gaussian fit with the mean and variance set
to the respective estimated values. Since the two histograms
are close to the Gaussian fits, it is reasonable to conclude
that the returns behave as if they were generated by a
Gaussian process.

The same conclusion can be drawn from a visual inspection of
the corresponding quantile-quantile (Q-Q) plots shown in
Figure 7. Specifically, the Q-Q plot for the distribution of
the daily returns before the onset of the drift is shown in
Figure 7a and the Q-Q plot for the distribution with the drift
in effect is shown in Figure 7b. Since both plots use centered
and scaled data, the fitted Gaussian distribution is the
standard normal distribution. The fact that both plots are
effectively a straight line is evidence of the “Gaussianness”
of the return distribution before and after the drift.
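The points of a Q-Q plot of centered and scaled data against the standard normal distribution can be computed with the Python standard library alone. The sketch below is illustrative: the plotting positions $(k - 0.5)/n$ are one common convention and are an assumption (the text does not say which convention was used), the function name `qq_points` is hypothetical, and the sample is synthetic rather than HST data.

```python
import random
from statistics import NormalDist, fmean, stdev
from typing import List, Sequence, Tuple


def qq_points(sample: Sequence[float]) -> List[Tuple[float, float]]:
    """Pairs (standard normal quantile, standardized empirical quantile)."""
    n = len(sample)
    mu, sigma = fmean(sample), stdev(sample)
    # Center and scale the data, then sort to obtain empirical quantiles.
    ordered = sorted((v - mu) / sigma for v in sample)
    std_normal = NormalDist()
    # Plotting positions (k - 0.5) / n for k = 1, ..., n (assumed convention).
    theoretical = [std_normal.inv_cdf((k - 0.5) / n) for k in range(1, n + 1)]
    return list(zip(theoretical, ordered))


# Synthetic Gaussian sample (an assumption, not HST data).
rng = random.Random(0)
sample = [rng.gauss(0.02, 0.23) for _ in range(500)]
points = qq_points(sample)
print(points[0], points[-1])
```

If the data are close to Gaussian, the resulting points lie near the line $y = x$; systematic curvature in the tails would indicate heavier or lighter tails than the normal law.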

Another important question to be examined about the time
series at hand concerns the series’ correlation structure. To
that end, Figure 8 shows the correlation plot for the HST
stock daily return series. Specifically, the plot
distinguishes whether the data are from before or after the
2003 event, and shows the autocorrelation function for the
former piece in black and for the latter one in gray. It is
clear from the plot that the data are essentially random
throughout the entire set, as they exhibit no strong structure
or correlation.
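A sample autocorrelation function of the kind plotted in a correlogram can be sketched as follows. The band $\pm 1.96/\sqrt{n}$ used to flag lags is the usual large-sample 95% band for white noise; whether the figure uses exactly this band is an assumption, and the function names are hypothetical.

```python
from math import sqrt
from statistics import fmean
from typing import List, Sequence


def sample_acf(x: Sequence[float], max_lag: int) -> List[float]:
    """Sample autocorrelations r_0, ..., r_max_lag (r_0 equals 1)."""
    n = len(x)
    mean = fmean(x)
    centered = [v - mean for v in x]
    denom = sum(c * c for c in centered)
    return [
        sum(centered[t] * centered[t + k] for t in range(n - k)) / denom
        for k in range(max_lag + 1)
    ]


def significant_lags(x: Sequence[float], max_lag: int, z: float = 1.96) -> List[int]:
    """Lags k >= 1 whose |r_k| exceeds the white-noise band z / sqrt(n)."""
    band = z / sqrt(len(x))
    acf = sample_acf(x, max_lag)
    return [k for k in range(1, max_lag + 1) if abs(acf[k]) > band]


# Strongly anti-correlated toy series (an assumption, not HST data).
toy = [1.0, -1.0] * 50
print(sample_acf(toy, 2), significant_lags(toy, 2))
```

With a 95% band, roughly one lag in twenty is expected to cross the band by chance even for uncorrelated data, which is why a handful of flagged lags need not contradict the “no-correlation” reading.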

To reinforce the “no-correlation” conclusion arrived at from
Figure 8, Figure 9 provides a selection of lag plots for the
data, for lags equal to 1, 2, 3, 11, and 13. According to
Figure 8, these are the lags at which the data autocorrelation
function may be considered statistically significant (at the
95% confidence level). The scatter plots shown in Figure 9 are
meant to offer additional insight into the correlation
structure of the time series at these lags. As in Figure 8
above, in Figure 9 the data are also split into two
categories, before the 2003 event and after, and the two
categories are distinguished using black for the first
category (before the 2003 event) and gray for the second one
(after the 2003 event). The lack of any apparent pattern in
any of the five lag plots indicates that the HST stock daily
return series exhibits no appreciable temporal correlation.

Figure 9: Lag plots (scatter plots) for the HST stock returns:
(a) lag-1 plot; (b) lag-2 plot; (c) lag-3 plot; (d) lag-11
plot; (e) lag-13 plot.

4.4 Online structural break detection 

We are now in a position to apply the change-point detection
methodology of Section 3 to detect changes in the statistical
pattern of the HST returns. To assess the performance of our
detection methods, we will measure the detection delay
relative to the change-point estimated by the
Brodsky-Darkhovsky estimator (20) above. Recall that we are
interested in comparing two score-based change-point detection
procedures: the CUSUM chart given by (15)-(16) and the
Shiryaev-Roberts (SR) procedure given by (13)-(14). Selecting
the score function as in (17)-(18) for either procedure, we
have implemented both detection methods in MATLAB, the
well-known scientific computing platform developed by
MathWorks, Inc. (see on the Web at http://www.mathworks.com).
Since the above analysis of the data led to the conclusion
that the data do follow a Gaussian model (before as well as
after the change), to set up the detection thresholds of the
CUSUM chart and the SR procedure we assumed the Gaussian model
with the parameters chosen as estimated in the above analysis.
Via a simple Monte Carlo experiment we estimated that setting
$A \approx 60$ and $h \approx 0.3$ ensures that the ARL to
false alarm of either procedure is approximately 7 samples,
which is roughly a week, since the timescale is working days.
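The Monte Carlo calibration of thresholds mentioned above can be sketched in Python (rather than the MATLAB used by the authors). Several assumptions are made here and should be read as such: the score function (17)-(18) is not reproduced, and the exact Gaussian log-likelihood ratio is substituted for it; the recursions are the textbook SR and CUSUM recursions, assumed to have the same form as (13)-(16); and the model parameters are unit-variance toy values. Consequently the ARL figures this sketch produces are not comparable to the $A \approx 60$, $h \approx 0.3$ calibration quoted above.

```python
import math
import random
from typing import Callable


def gaussian_llr(x: float, mu_pre: float, sd_pre: float,
                 mu_post: float, sd_post: float) -> float:
    """Log-likelihood ratio of N(mu_post, sd_post^2) to N(mu_pre, sd_pre^2)."""
    return (math.log(sd_pre / sd_post)
            + (x - mu_pre) ** 2 / (2 * sd_pre ** 2)
            - (x - mu_post) ** 2 / (2 * sd_post ** 2))


def sr_run_length(draw: Callable[[], float],
                  score: Callable[[float], float], A: float) -> int:
    """Observations until the SR statistic R_n = (1 + R_{n-1}) e^{S_n} reaches A."""
    r, n = 0.0, 0
    while r < A:
        n += 1
        r = (1.0 + r) * math.exp(score(draw()))
    return n


def cusum_run_length(draw: Callable[[], float],
                     score: Callable[[float], float], h: float) -> int:
    """Observations until the CUSUM statistic W_n = max(0, W_{n-1} + S_n) reaches h."""
    w, n = 0.0, 0
    while w < h:
        n += 1
        w = max(0.0, w + score(draw()))
    return n


def arl_to_false_alarm(run_length, threshold: float,
                       runs: int = 2000, seed: int = 1) -> float:
    """Monte Carlo estimate of the ARL to false alarm (no change ever occurs)."""
    rng = random.Random(seed)
    # Hypothetical toy model: N(0, 1) before the change, N(1, 1) after.
    mu_pre, sd_pre, mu_post, sd_post = 0.0, 1.0, 1.0, 1.0
    draw = lambda: rng.gauss(mu_pre, sd_pre)
    score = lambda x: gaussian_llr(x, mu_pre, sd_pre, mu_post, sd_post)
    return sum(run_length(draw, score, threshold) for _ in range(runs)) / runs


print(arl_to_false_alarm(sr_run_length, 20.0))
print(arl_to_false_alarm(cusum_run_length, 2.0))
```

To calibrate a procedure to a target ARL to false alarm, one would repeat the estimate over a grid of thresholds (or use bisection) and pick the threshold whose estimated ARL matches the target.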

The detection process is illustrated in Figure 10.
Specifically, Figure 10a shows the behavior of the SR
statistic in a short time window covering March 13, 2003,
i.e., the date at which the HST stock underwent the change we
would like to detect. This “zoomed-in” scale better
illustrates the dynamics of the detection statistic around the
change-point. Figure 10b shows the same for the CUSUM
statistic. We see that both procedures successfully detect the
onset of the drift (occurring on March 13, 2003), and the
detection delays are about one day each.

Figure 6: Empirical probability densities for the HST stock
returns with Gaussian fits: (a) before the 2003 event;
(b) after the 2003 event.

Figure 7: Q-Q plots for the HST stock return distribution
vs. the standard Gaussian distribution: (a) before the 2003
event; (b) after the 2003 event.

Figure 8: Autocorrelation function for the HST stock returns.

To conclude this section, we would like to remark that the
dynamics of the CUSUM statistic is generally more informative
than the dynamics of the SR statistic; compare, e.g.,
Figure 10b showing the CUSUM statistic and Figure 10a showing
the corresponding SR statistic. Specifically, a visual
examination of the behavior of the CUSUM statistic as a
function of time allows one not only to see whether the change
has occurred, but also to estimate the time of its occurrence,
i.e., the change-point: it is likely to be somewhere between
the time instance the CUSUM statistic last hit zero and the
point at which the statistic hit (or went above) the detection
threshold (i.e., the point of alarm). Indeed, on the one hand,
the change-point is unlikely to be past the point of alarm. On
the other hand, as we discussed in the previous section, the
CUSUM statistic is effectively a random walk with the
“instantaneous” LLRs being the increments. Since the LLRs are,
on average, negative if no change is in effect, and positive
otherwise, the drift of the random walk the CUSUM chart uses
for its decision-making is negative before the change and
positive after. As a result, the CUSUM statistic effectively
stays close to zero in the pre-change regime, because zero is
its reflecting barrier: every time the CUSUM statistic hits
zero it resets itself, completely “forgetting” everything it
had previously “learned” about the data. This equips the CUSUM
chart with a built-in resetting mechanism: if after a
sufficiently long period of surveillance the data have given
no indication of a change, the CUSUM statistic will likely
reset itself (by hitting zero), i.e., it will discard the
entire history of observations made up to that point and start
completely anew. Hence, the change-point is unlikely to be to
the left of the latest point at which the CUSUM statistic
visited zero. This intrinsic self-restarting feature is the
main reason for the exact minimax optimality (in the sense of
Lorden [21]) of the CUSUM chart established in [24, 37]. By
contrast, the SR statistic when plotted against time does not
offer this kind of convenience of interpretation, for the SR
procedure’s decision-making mechanism uses entirely different
principles. Nevertheless, the SR procedure is exactly
multi-cyclic optimal, and the CUSUM chart is not. Therefore,
when it comes to monitoring processes that are unlikely to
undergo a structural break for a long period of time, so that
change-point detection has to be performed in cycles, basing
surveillance on the SR procedure might be a better option than
going with the CUSUM chart.
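The bracketing argument for the CUSUM statistic, namely that the change-point is likely to lie between the last visit to zero and the alarm, can be sketched as follows. The function name `cusum_bracket` and the toy score sequence are hypothetical; the scores stand in for the “instantaneous” LLRs or, more generally, for whatever score function the chart uses.

```python
from typing import Optional, Sequence, Tuple


def cusum_bracket(scores: Sequence[float], h: float) -> Optional[Tuple[int, int]]:
    """Run W_n = max(0, W_{n-1} + S_n) and bracket the change-point.

    Returns (last_zero, alarm): the latest time (1-based; 0 means the start)
    at which the statistic sat at zero before the alarm, and the time of the
    alarm itself, i.e. the first n with W_n >= h. Returns None if no alarm.
    """
    w, last_zero = 0.0, 0
    for n, s in enumerate(scores, start=1):
        w = max(0.0, w + s)
        if w >= h:
            return last_zero, n
        if w == 0.0:
            last_zero = n  # the statistic has reset; earlier history is dropped
    return None


# Toy scores (an assumption): negative drift first, positive drift afterwards.
toy_scores = [-1.0, -1.0, 0.5, -1.0, 1.0, 1.0, 1.0]
print(cusum_bracket(toy_scores, h=2.5))
```

The returned pair is only a heuristic bracket: it reflects the reset behavior of the statistic and carries no formal confidence level.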

Figure 10: Detection of the 2003 anomaly in the HST stock
by the SR and CUSUM procedures: (a) by the SR procedure;
(b) by the CUSUM chart.

5. CONCLUSION

We considered the problem of rapid but reliable anomaly
detection in “live” financial data. We treated the problem
statistically, viz. as that of quickest change-point
detection, and proposed an anomaly-detection method that
derives from the multi-cyclic (repeated) Shiryaev-Roberts (SR)
detection procedure. We decided to go with this largely
neglected near-coeval of the celebrated CUSUM and EWMA charts
because of the strong multi-cyclic optimality properties that
the SR procedure was recently discovered to have under the
basic iid change-point detection setup; no such properties are
exhibited by either the “good old” CUSUM “inspection scheme”
or the EWMA chart. To handle real-world financial data, the
proposed SR-derivative utilizes the information contained in
the data in the SR-like Bayesian manner with the likelihood
ratio replaced with a change-sensitive score function. This
simple idea allowed the proposed procedure to preserve the low
computational complexity of its prototype, the original SR
procedure. More importantly, we carried out a case study where
the proposed procedure was set up to detect an anomaly in a
real-world financial time series, and the obtained
experimental results indicated that our procedure may have
also preserved the favorable “false alarm risk”-vs.-“detection
speed” tradeoff of the original SR procedure.

ACKNOWLEDGEMENTS 

The authors would like to thank the two Guest Editors,
Prof. Shouyang Wang (Institute of Systems Science, Beijing,
P.R. China) and Prof. Anatoly Zhigljavsky (Cardiff University,
Cardiff, Wales, UK), and the Editor-in-Chief, Prof. Heping
Zhang (Yale University, New Haven, Connecticut, USA), for the
time and effort they invested to produce this special issue of
the Journal. The authors are also personally thankful to
Prof. Zhigljavsky for the invitation to contribute this work
to the special issue. The constructive feedback provided by
the two anonymous referees is greatly appreciated as well.

The effort of A.S. Polunchenko was supported, in part, by
the Simons Foundation (www.simonsfoundation.org) via a
Collaboration Grant in Mathematics (Award #304574).

A.S. Polunchenko is also equally indebted to the Office of 
the Dean of the Harpur College of Arts and Sciences at the 
State University of New York at Binghamton for the support 
provided through the Dean’s Research Semester Award for 
Junior Faculty granted for the Fall semester of 2014. 

The work of A. Pepelyshev was partly supported by 
the St. Petersburg State University, Russia, under project 
#6.38.435.2015. 

Received January 2015 

REFERENCES 

[1] Basseville, M. and Nikiforov, I. V. (1993). Detection of Abrupt
Changes: Theory and Application. Prentice Hall, Englewood
Cliffs, NJ.

[2] Brodsky, B. E. and Darkhovsky, B. S. (1993). Nonparametric
Methods in Change-Point Problems. Mathematics and Its
Applications 243. Kluwer Academic Publishers, Norwell, MA.

[3] Brodsky, B. E. and Darkhovsky, B. S. (2000). Non-Parametric
Statistical Diagnosis: Problems and Methods. Mathematics and
Its Applications 509. Kluwer Academic Publishers, Norwell, MA.

[4] Dragalin, V. P. (1994). Optimality of a generalized CUSUM pro¬ 
cedure in quickest detection problem. In Proceedings of the Steklov 
Institute of Mathematics: Statistics and Control of Random Pro¬ 
cesses., 202 107-120. American Mathematical Society, Providence, 
RI. 

[5] Du, W., Polunchenko, A. S. and Sokolov, G. (2015). On
Robustness of the Shiryaev-Roberts Change-Point Detection
Procedure under Parameter Misspecification in the Post-Change
Distribution. Communications in Statistics - Simulation and
Computation (in press). Available online at:
http://www.tandfonline.com/doi/full/10.1080/03610918.2015.1039131.

[6] Ergashev, B. A. (2004). Sequential Detection of US Business
Cycle Turning Points: Performances of Shiryayev-Roberts, CUSUM
and EWMA Procedures. Available online at the Economics Working
Paper Archive (EconWPA):
https://ideas.repec.org/p/wpa/wuwpem/0402001.html.

[7] Frisén, M. (2008). Financial Surveillance. John Wiley & Sons,
Inc., Hoboken, NJ.

[8] Frisén, M. (2009). Optimal Sequential Surveillance for Finance,
Public Health, and Other Areas. Sequential Analysis 28 310-337.

[9] Girshick, M. A. and Rubin, H. (1952). A Bayes approach to
a quality control model. Annals of Mathematical Statistics 23
114-125.

[10] Golyandina, N., Nekrutkin, V. and Zhigljavsky, A. A. (2001).
Analysis of Time Series Structure: SSA and Related Techniques.
Monographs on Statistics and Applied Probability 90. Chapman
& Hall/CRC, London, UK.

[11] Golyandina, N. and Zhigljavsky, A. (2013). Singular Spectrum 
Analysis for Time Series. Springer Briefs in Statistics. Springer. 

[12] Gordon, L. and Pollak, M. (1994). An Efficient Sequential Non- 
parametric Scheme for Detecting a Change of Distribution. An¬ 
nals of Statistics 22 763-804. 

[13] Gordon, L. and Pollak, M. (1995). A Robust Surveillance 
Scheme for Stochastically Ordered Alternatives. Annals of Statis¬ 
tics 23 1350-1375. 

[14] Host Hotels & Resorts, Inc. (2001). US Securities and Ex¬ 
change Commission, Form 10-K, Tax Year 2001. 

[15] Host Hotels & Resorts, Inc. (2002). Annual Report 2002.

[16] Host Hotels & Resorts, Inc. (2003). Annual Report 2003.

[17] Host Hotels & Resorts, Inc. (2014). US Securities and Ex¬ 
change Commission, Form 10-K, Tax Year 2014. 

[18] Lai, T. L. (1995). Sequential changepoint detection in quality 
control and dynamical systems (with discussion). Journal of the 
Royal Statistical Society. Series B. Methodological 57 613-658. 

[19] Lai, T. L. (1998). Information bounds and quick detection of 
parameter changes in stochastic systems. IEEE Transactions on 
Information Theory 44 2917-2929. 

[20] Lai, T. L. and Xing, H. (2015). Active Risk Management:
Financial Models and Statistical Methods. Chapman and Hall/CRC
Financial Mathematics Series. Chapman & Hall/CRC Press, Boca
Raton, FL.

[21] Lorden, G. (1971). Procedures for reacting to a change in distri¬ 
bution. Annals of Mathematical Statistics 42 1897-1908. 

[22] McDonald, D. (1990). A CUSUM Procedure Based on Sequential
Ranks. Naval Research Logistics 37 627-646.

[23] Moskvina, V. and Zhigljavsky, A. (2003). An Algorithm Based 
on Singular Spectrum Analysis for Change-Point Detection. Com¬ 
munications in Statistics — Simulation and Computation 32 319- 
352. 

[24] Moustakides, G. V. (1986). Optimal stopping times for detecting 
changes in distributions. Annals of Statistics 14 1379-1387. 

[25] Page, E. S. (1954). Continuous Inspection Schemes. Biometrika 
41 100-115. 

[26] Pollak, M. (1985). Optimal detection of a change in distribution. 
Annals of Statistics 13 206—227. 

[27] Pollak, M. (1987). Average run lengths of an optimal method of 
detecting a change in distribution. Annals of Statistics 15 749- 
779. 

[28] Pollak, M. (2010). A Robust Changepoint Detection Method. 
Sequential Analysis 29 146-161. 

[29] Pollak, M. and Krieger, A. M. (2013). Shewhart Revisited. 
Sequential Analysis 32 230-242. 

[30] Pollak, M. and Tartakovsky, A. G. (2008). Exact Optimality 
of the Shiryaev—Roberts Procedure for Detecting Changes in Dis¬ 
tributions. In Proceedings of the 2008 International Symposium 
on Information Theory and Its Applications 1—6. 

[31] Pollak, M. and Tartakovsky, A. G. (2009). Optimality Proper¬ 
ties of the Shiryaev—Roberts procedure. Statistica Sinica 19 1729- 
1739. 

[32] Polunchenko, A. S., Sokolov, G. and Du, W. (2013). Quickest 
Change-Point Detection: A Bird’s Eye View. In Proceedings of the 
2013 Joint Statistical Meetings. 

[33] Polunchenko, A. S. and Tartakovsky, A. G. (2010). On opti¬ 
mality of the Shiryaev—Roberts procedure for detecting a change 
in distribution. Annals of Statistics 38 3445—3457. 

[34] Polunchenko, A. S. and Tartakovsky, A. G. (2012).
State-of-the-Art in Sequential Change-Point Detection.
Methodology and Computing in Applied Probability 14 649-684.

[35] Polunchenko, A. S., Tartakovsky, A. G. and Mukhopadhyay, N.
(2012). Nearly Optimal Change-Point Detection with an
Application to Cybersecurity. Sequential Analysis 31 409-435.

[36] Poor, H. V. and Hadjiliadis, O. (2009). Quickest Detection. 
Cambridge University Press, New York, NY. 

[37] Ritov, Y. (1990). Decision theoretic optimality of the CUSUM 
procedure. Annals of Statistics 18 1464-1469. 

[38] Roberts, S. W. (1959). Control chart tests based on geometric 
moving averages. Technometrics 1 239—250. 

[39] Roberts, S. W. (1966). A comparison of some control chart pro¬ 
cedures. Technometrics 8 411-430. 

[40] Shewhart, W. A. (1925). The application of statistics as an aid
in maintaining quality of a manufactured product. Journal of the
American Statistical Association 20 546-548.

[41] Shewhart, W. A. (1931). Economic Control of Quality of
Manufactured Product. Bell Telephone Laboratories series. D. Van
Nostrand Company, Inc., Princeton, NJ.

[42] Shiryaev, A. N. (1961). The problem of the most rapid detection
of a disturbance in a stationary process. Soviet Mathematics -
Doklady 2 795-799. Translation from Dokl. Akad. Nauk SSSR
138:1039-1042, 1961.

[43] Shiryaev, A. N. (1963). On optimum methods in quickest de¬ 
tection problems. Theory of Probability and Its Applications 8 
22-46. 

[44] Shiryaev, A. N. (1978). Optimal Stopping Rules. Springer- 
Verlag, New York, NY. 

[45] Shiryaev, A. N. and Zryumov, P. Y. (2010). On the Linear and
Nonlinear Generalized Bayesian Disorder Problem (Discrete Time
Case). In Optimality and Risk - Modern Trends in Mathematical
Finance (F. Delbaen, M. Rásonyi and C. Stricker, eds.) 227-236.
Springer Berlin Heidelberg.

[46] Siegmund, D. (1985). Sequential Analysis: Tests and Confidence
Intervals. Springer Series in Statistics. Springer-Verlag, New
York, NY.

[47] Tartakovsky, A., Nikiforov, I. and Basseville, M. (2014).
Sequential Analysis: Hypothesis Testing and Changepoint
Detection. Monographs on Statistics and Applied Probability 166.
CRC Press, Boca Raton, FL.

[48] Tartakovsky, A. G. (2005). Asymptotic performance of a mul¬ 
tichart CUSUM test under false alarm probability constraint. In 
IEEE Conference on Decision and Control 44 320-325. 

[49] Tartakovsky, A. G. and Moustakides, G. V. (2010). State-of- 
the-Art in Bayesian Changepoint Detection. Sequential Analysis 
29 125-145. 

[50] Tartakovsky, A. G., Pollak, M. and Polunchenko, A. S. 
(2012). Third-order asymptotic optimality of the Generalized 
Shiryaev-Roberts changepoint detection procedures. Theory of 
Probability and Its Applications 56 457—484. 

[51] Tartakovsky, A. G. and Polunchenko, A. S. (2010). Minimax 
Optimality of the Shiryaev-Roberts Procedure. In Proceedings of 
the 5th International Workshop on Applied Probability. 

[52] Tartakovsky, A. G., Polunchenko, A. S. and Sokolov, G.
(2013). Efficient Computer Network Anomaly Detection by
Changepoint Detection Methods. IEEE Journal of Selected Topics
in Signal Processing 7 4-11.

[53] Tartakovsky, A. G., Rozovskii, B. L., Blazek, R. B. and 
Kim, H. (2006). A novel approach to detection of intrusions in 
computer networks via adaptive sequential and batch-sequential 
change-point detection methods. IEEE Transactions on Signal 
Processing 54 3372-3382. 

[54] Tartakovsky, A. G., Rozovskii, B. L., Blazek, R. B. and 
Kim, H. (2006). Detection of intrusions in information systems 
by sequential changepoint methods (with discussion). Statistical 
Methodology 3 252-340. 

[55] Veeravalli, V. V. and Banerjee, T. (2013). Quickest Change 
Detection. In Academic Press Library in Signal Processing: Array 
and Statistical Signal Processing, (R. Chellappa and S. Theodor- 
idis, eds.) 3 209-256. Academic Press, Oxford, UK. 

[56] Willsky, A. S. and Jones, H. L. (1976). A generalized likelihood
ratio approach to detection and estimation of jumps in linear
systems. IEEE Transactions on Automatic Control 21 108-112.

[57] Woodroofe, M. (1982). Nonlinear Renewal Theory in Sequential
Analysis. Society for Industrial and Applied Mathematics,
Philadelphia, PA.

[58] Zhigljavsky, A. (2009). Application of the Singular Spectrum
Analysis for Change-point Detection in Time Series. In
Proceedings of the 2nd International Workshop in Sequential
Methodologies.

[59] Zhigljavsky, A. A. and Kraskovsky, A. E. (1988). Detection
of Abrupt Changes of Random Processes in Radiotechnics
Problems. St. Petersburg University Press, St. Petersburg,
Russia (in Russian).

Andrey Pepelyshev 

Faculty of Mathematics 

St. Petersburg State University 

Peterhof, St. Petersburg 198504 

Russia 

School of Mathematics 
Cardiff University 
Cardiff, CF24 4AG 
UK 

E-mail address: andrey@ap7236.spb.edu 

Aleksey S. Polunchenko 
Department of Mathematical Sciences 
State University of New York at Binghamton 
Binghamton, New York 13902-6000 
USA 

E-mail address: aleksey@binghamton.edu
URL: http://www.math.binghamton.edu/aleksey




